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A tetrahedral space-filling curve for non-conforming adaptive meshes 

CARSTEN BURSTEDDE* AND JOHANNES HOLKE*t 


Abstract. We introduce a space-filling curve for triangular and tetrahedral red-refinement that 
can be computed using bitwise interleaving operations similar to the well-known Z-order or Morton 
curve for cubical meshes. To store sufficient information for random access, we define a low-memory 
encoding using 10 bytes per triangle and 14 bytes per tetrahedron. We present algorithms that 
compute the parent, children, and face-neighbors of a mesh element in constant time, as well as the 
next and previous element in the space-filling curve and whether a given element is on the boundary 
of the root simplex or not. Our presentation concludes with a scalability demonstration that creates 
and adapts selected meshes on a large distributed-memory system. 
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1. Introduction. Conforming adaptive mesh refinement for simplicial (trian¬ 
gular and tetrahedral) meshes is one of the most successful concepts in numerical 
mathematics and computational science and engineering; see, e.g., [3,21,50]. Sim- 
plices provide high flexibility in meshing to arbitrary domain geometries [46, 47] and 
can be mapped to a reference simplex using an elementwise constant Jacobian in 
most cases, which allows for an efficient numerical implementation. Discretization 
and integration methods of various kinds are available for both low and high orders 
of accuracy. 

Large-scale scientific computing requires fast and scalable algorithms for (1) adap¬ 
tive refinement and coarsening (AMR) as well as (2) parallel partitioning. One class 
of methods for AMR exploits the properties of Delaunay triangulations [23,26,45], 
while partitioning of unstructured meshes is often approached by formulating the 
mesh topology as a graph. These triangulations are usually conforming; that is, ele¬ 
ments intersect only along whole faces and edges. Common application codes such as 
FEniCS [33], PLUM [35,36], OpenFOAM [37], or MOAB from the SIGMA toolkit [53] 
delegate graph partitioning to third-party software like Par METIS [29] or Scotch [20] . 
Graph-based algorithms have been advanced to target millions of processes and bil¬ 
lions of elements [18,42,49]. Still, increasing the scalability and decreasing the absolute 
runtime and memory demands of distributed implementations remains a challenge, 
and the lack of an obvious parent-child structure in many unstructured meshing ap¬ 
proaches prevents certain use cases. 

Nonconforming AMR in combination with recursive refinement makes refinement 
and coarsening nearly trivial operations. The additional mathematical logic to enable 
hanging faces and edges is well understood for both continuous and discontinuous 
discretizations [1,22,30,43]. It is local to the loop over the finite elements or volumes 
and transparent to most of the numerical pipeline, thus offering the possibility to 
extend existing conforming codes. The resolution may be as coarse as any chosen 
root mesh (a mesh which is not intended for further coarsening), which poses only a 
slight limit to geometric flexibility. 

The challenge of efficient partitioning of meshes may be addressed using space- 
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filling curves (SFC). Instead of working on the NP-hard graph partitioning problem 
[12], SFCs are used to approximately solve the partitioning problem in linear runtime 
[4,57]. Common SFCs establish an equivalence between the adaptive mesh and a 
tree, where the mesh elements correspond one-to-one to the leaves of the tree. They 
also define a total order among the leaves that can be used to store application data 
linearly in the same order as the elements [15]. The original concept is specified for 
a cubic domain and can be traced back to over a hundred years ago [25,39] . The 
Sierpinski curve [48] and generalized versions of it are the most common SFCs on 
simplicial meshes. These were described by Bansch [8], Kossaczky [31], and others; 
see e.g. [4,44]. The computation of parents and children, for example, usually involves 
a loop over the refinement levels, equivalent to following the path between the tree 
root and a leaf. 

When using hypercubes as mesh primitives, the logic of common SFCs can be 
formulated with remarkable simplicity due to the local tensor product structure [39, 
51,52,54,57]. It is also well-suited to compute topological entities like face-neighbors, 
children, and parents of given mesh elements. The Morton curve [34], in particular, 
allows one to design algorithms for these tasks (see, e.g., [16]) whose runtime is level- 
independent, i.e., does not depend on the depth of the refinement tree. Moreover, 
SFCs are memory efficient, since for a mesh element only the coordinates of one 
anchor node plus its refinement level have to be stored. Today, this principle has 
found its way into widely used software libraries [5,6]. Memory usage can be further 
optimized by incremental encoding along the SFC [4,13,55]. 

Using SFCs on hexahedral meshes is exceptionally fast and scalable [14,28,41]. 
This fact has not only been exploited in writing simulation codes using hexahedral 
meshes, but also by approaches that use the hexahedral SFC as an instrument to 
partition simplices, mapping them into a surrounding cube [2]. However, hexahedral 
AMR is more limited geometrically than simplicial AMR. One indication for this is 
the (lack of) availability of (open-source) mesh generators that operate exclusively 
on cubes. Furthermore, assuming an existing simplicial code, rewriting it in terms of 
cubical meshes will be out of the question if the code is of sufficient size, complexity, 
or maturity. For these reasons, in this paper, we attempt to establish a new SFC 
for triangles and tetrahedra that has many of the favorable properties known for 
hexahedra. 

Our starting point is to divide the simplices in a refined mesh into two (two 
dimensions, 2D), respectively six (three dimensions, 3D), different types and selecting 
for each type a specific reordering for Bey’s red-refinement [9]. This type and the 
coordinates of one vertex serve as a unique identifier, the Tet-id, of the simplex in 
question. In particular, we do not require storing the type of all parent simplices to 
the root, as one might naively imagine. We then propose a Morton-like coordinate 
mapping that can be computed from the Tet-id and gives rise to an SFC. Based on 
this logic, we develop constant-time algorithms (that is, not requiring a loop over 
the refinement levels) to compute the tetrahedral-Morton identifiers of any parent, 
child, face-neighbor, and SFC-successor/predecessor and to decide whether for two 
given elements one is an ancestor of the other. We conclude with scalability tests of 
mesh creation and adaptation with over 8.5 x 10^^ tetrahedral mesh elements on up 
to 131,072 cores of the JUQUEEN supercomputer and 786,432 cores of MIRA. 

2. Mesh refinement on simplices. Our aim is to define and examine a new 
SFC for triangles and tetrahedra by adding ordering prescriptions to the noncon¬ 
forming Bey-refinement (also called red-refinement) [9,10,56]. We briefly restate the 
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Fig. 1. Left: The refinement scheme for triangles in two dimensions. A triangle T = 
lxo,Xi,X2] C is refined by dividing each face at the midpoints Xij . We obtain four smaller 
triangles, all similar to T. Right: The situation in three dimensions. If we divide the edges of the 
tetrahedron T = [aloi 5?! > ^3] C in half, we get four smaller tetrahedra (similar to T) and an 

inner octahedron. By dividing the octahedron along any of its three diagonals (shown dashed) we 
finally end up with partitioning T into eight smaller tetrahedra, all having the same volume. The 
refinement rule of Bey is obtained by always choosing the diagonal from xq 2 to X 13 and numbering 
the corners of the children according to (2). 


red-refinement in this section and contrast it with the well-known conforming (or 
red/green) refinement. 

We refer to triangles and tetrahedra as d-simplices, where d C {2,3} specifies the 
dimension. It is sometimes convenient to drop d from this notation. A d-simplex T C 
is uniquely determined by its d-|-l afHne-independent corner nodes xq, ... ,Xd & 
Their order is signihcant, and therefore we write 

(la) T = [xgi,xi,X 2 ] in 2D, 

(lb) T =[xgi,xi,X 2 ,X'i] in 3D. 

We define Xq as the anchor node of T. By Xij we denote the midpoint between x^ 
and Xj, thus Xij = ^(xi + Xj). 

2.1. Bey’s refinement rnle. Bey’s rule is a prescription for subdividing a sim¬ 
plex. It is one instance of the so-called red-refinement, where all faces of a simplex 
are subdivided simultaneously. 

Definition 1. Given a d-simplex T = [afo, • ■ ■ ,Xd] C K'’*, the refinement rule of 
Bey consists of cutting off four subsimplices at the corners (as in Figure 1). In 3D 
the remaining octahedron is then divided along the diagonal from a ?02 to a?i 3 . Bey 
numbers the 2^ resulting subsimplices as follows. 



To 

[xo, Xoi: ^ 02 ]: 

Ti 

[^01 1 2 ^ 11 ^ 12 ], 


T2 

:= [X02, ^12, ^ 2 ], 

To 

'■= [^ 01 ,^ 02 ,^ 12 ], 

To 

:= 

[xq ; 3^01 5 ^02: ^ 03 ]; 

Ti 

:= [afoi,afo2,^ 03 ,^ 13 ], 

Ti 

:= 
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Fig. 2. The basic triangle (2D) and tetrahedra types (3D) obtained by dividing [0,1]'* into 
simplices of varying types, denoted by a subscript. Top left: The unit square can be divided into two 
triangles sharing the edge from (0,0)^ to (1,1)^. We denote these triangles by Sq and Si. The 
four corners of the square are numbered cq, ..., C3 in yx-order. Top right: The comer nodes of So 
and Si in terms of the square corners. Bottom left (exploded view): In three dimensions the unit 
cube can be divided into six tetrahedra, all sharing the edge from the origin to (1,1,1)^. We denote 
these tetrahedra by So,..., S 5 . The eight corners of the cube are numbered cq, ... , C7 in zyx-order 
(redrawn and modified with permission [9]). Bottom right: The corner nodes of the six tetrahedra 
So,..., S 5 in terms of the cube corners. 


Definition 2. In this paper, a refinement of a d-simplex S denotes a set 5^ of 
d-dimensional subsimplices of S that can be constructed from S via successive re¬ 
finement, where only the finest simplices belong to the actual refinement. Thus all 
refinements can be obtained applying the following rules: 

• = {S} is a refinement of S. 

• If is a refinement of S, then 5 ^ = (= 5 ^'\ { T }) U { Tg, ..., T 2 d_i } is a 
refinement for every T G . 

We explicitly allow nonuniform refinements and thus nonconforming faces and edges. 

Definition 3. The Ti from (2) are called the children of T, and T is called the 
parent of the Ti, written T = P{Ti). Therefore, we also call the Ti siblings of each 
other. If a d-simplex T belongs to a refinement of another d-simplex S, then T is a 
descendant of S, and S is an ancestor ofT. The number I of refining steps needed 
to obtain T from S is unique and called the level of T (with respect to S); we write 
£ = £{T). Usually S is clear from the context, and therefore we omit it in the notation. 
By definition, T is an ancestor and descendant of itself. 

Consider the six tetrahedra Sq, ...,85 C displayed in Figure 2. These tetra¬ 
hedra form a triangulation of the unit cube. The results and algorithms in this paper 
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Fig. 3. Triangulating a cube according to Figure 2 and then refining the tetrahedra via Bey’s 
refinement rule results in the same mesh as first refining the cube into eight subcubes and afterward 
triangulating each of these cubes. Each occurring tetrahedron is uniquely determined by the subcube 
it lies in plus its type. The same situation can be observed in 2D if we restrict our view to one side 
of the cube. 


rely on the following property [9]. 

Property 4. Refining the six tetrahedra from the triangulation of the unit cube 
simultaneously to level i results in the same mesh as first refining the unit cube to level 
I and then triangulating each smaller cube with the six tetrahedra Sq, ■ ■ ■ ■, S^, scaled 
by a factor of 2~^ (see Figure 3). The same behavior can be observed in 2D when the 
unit square is divided into two triangles. 

2.2. Removal of hanging nodes using red/green refinement. It is worth 
noting that, although the methods and algorithms presented in this paper apply to 
red-refined meshes with hanging nodes, it is possible to augment them to create meshes 
without hanging nodes. For this we may use red/green or red/green/blue refinement 
methods [7,17]. 

After the red-refinement step we may add an additional and possibly nonlocal 
refinement operation that ensures a maximum level difference of 1 between neigh¬ 
boring simplices. Such an operation is also called 2:1 balance [27,52,54]; its full 
description, however, would exceed the scope of the present paper. Hanging nodes 
are then resolved by bisecting those simplices with hanging nodes (green/blue refine¬ 
ment) [4, section 12.1.3]. The 2D case is shown in Figure 4. 

If one of the newly created simplices shall be further refined, the bisection is 
reversed, the original simplex is red-refined, and the balancing and green-refinement 
is repeated. This may void the nesting property of certain discrete function spaces, 
yet applications may still prefer this approach over the manual implementation of 
hanging node constraints. 

3. A tetrahedral Morton index. The Morton index or Z-order for a cube in 
a hexahedral mesh is computed by bitwise interleaving the coordinates of the anchor 
node of the cube [34]. In this section we present an index for c?-simplices that also 
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Fig. 4. To resolve hanging nodes we can execute an additional step of green- or blue-refinement 
after the last refinement step [7,17]. Here we show the 2D refinement rules. Left: green (1 hanging 
node). Right: blue (2 hanging nodes). If a triangle has 3 hanging nodes it is red-refined. 


uses the bitwise interleaving approach, the tetrahedral Morton index (TM-index). To 
define the TM-index we look at refinements of a reference simplex, which we discuss 
in Section 3.1 below. For each d-simplex in a refinement of the reference simplex we 
define a unique identifier, the so-called Tet-id, which serves as the input to compute the 
TM-index and for all algorithms related to it. This Tet-id consists of the coordinates 
of the anchor node of the considered simplex plus one additional number, the type 
of the simplex. We define the Tet-id and type in Section 3.2. We then define the 
TM-index in Section 3.3 and in the following subsections. We show that it possesses 
locality properties similar to those of the Morton index. One novel aspect of this 
construction lies in logically including the types of the simplex and all its parents in 
the interleaving, while only using the type of the simplex itself in the algorithms. 

3.1. The reference simplex. Throughout the rest of this paper, let £ be a 
given maximal refinement level. Instead of the unit cube [0, l]'^, we consider the 
scaled cube [0,2'^]'^, ensuring that all node coordinates in a refinement up to level C 
are integers. Suppose we are given some d-simplex T C together with a refinement 

of T. By mapping T affine-linearly to the refinement is mapped to a 

refinement 5^' of 2^Sq. Therefore, to examine SFCs on refinements of T, it suffices 
to examine SFCs on 2'^S'o. Thus, we only consider refinements of the d-simplex 

:= 2^Sq. Let Td be the set of all possible descendants of this d-simplex with level 
smaller than or equal to £; thus 

(4) 7d = { r I T is a descendant of with 0 < i{T) < £ } . 

Any refinement (up to level £) of is a subset of 7d, and for each T G 7d there 
exists at least one refinement of with T G 3^. In this context, we refer to 
as the root simplex. Furthermore, let denote the set of all possible anchor node 
coordinates of d-simplices in Td, thus 

L2 = ([0,2^)2nZ2|y<x}, 

L3 = {[0,2^)3nZ3jj/<z<x}. 

Note that we could have chosen any other of the Si (scaled by 2^) as the root simplex 
and we do not see any advantage or disadvantage in doing so. 

3.2. The type and Tet-id of a d-simplex. Making use of Property 4, we 
define the following. 

Definition 5. Each d-simplex T G Td of level i lies in a d-cube of the hexahedral 
mesh that is part of a uniform level I refinement of [0, 2^Y. This specific cube is the 
associated cube of T and denoted by Qt- The d-simplex T is a scaled and shifted 
version of exactly one of the six tetrahedra Si that constitute the unit cube, and we 
define the type ofT as this number, type(T) := i. 
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Table 1 

For a d-simplex T of type b the table gives the types Ct{To), ■ ■ ■, Ct{T 2 d_i) ofT's children. The 
comer-children Tq,Ti,T 2 (and in 3D also T 3 ) always have the same type as T. 


The anchor node of a subcube of level (, is the particular corner of that cube with the 
smallest x-, y- (and z-) coordinates. This means that for each simplex T in the refine¬ 
ment from Figure 3 the anchor node of T and the anchor node of its associated cube 
coincide. Any two d-simplices in Td with the same associated cube are distinguishable 
by their type. 

From Bey’s observation from Figure 3 it follows that any simplex in Td can be 
obtained by specifying a level i, then choosing one level £ subcube of the root cube 
and finally fixing a type. This provides motivation for the following definition. 

Definition 6 (Tet-id). For T = [xo,...,Xd] € Td we define the Tet-id of T as 
the tuple of its anchor node and type; thus 

(6) Tet-id{T) := {xo,type{T)). 

Corollary 7. Let T,T' C Td- Then T = T' if and only if their Tet-ids and 
levels are the same. 

Note that in an arbitrary adaptive mesh there can be simplices with different levels 
and each simplex T has an associated cube of level ^(T). In particular, simplices with 
the same anchor node can have different associated cubes if their levels are not equal. 

Since any simplex in Td can be specified by the Tet-id and level, the Tet-id provides 
an important tool for our work. The construction of the TM-index in the next section 
and the algorithms that we present in Section 4 rely on the Tet-id as the fundamental 
data of a simplex. All information about a mesh can be extracted from the Tet-id 
and level of each element. 

Since the root simplex has type 0, in a uniform refinement more simplices have 
type 0 than any other type. However, a close examination of Table 1 together with a 
short inductive argument leads to the following proposition. 

Proposition 8. In the limit £ —>■ 00 the different types of simplices in a uniform 
level £ refinement of Td occur in equal ratios. 

3.3. Encoding of the tetrahedral Morton index. In addition to the anchor 
coordinates the TM-index also depends on the types of all ancestors of a simplex. In 
order to define the TM-index we start by giving a formal definition of the interleaving 
operation and some additional information. 

Definition 9. We define the interleaving a T b of two n-tuples a = {on-i, 
...,ao) and b = ( 6 „_i,..., 60 ) as the 2n-tuple obtained by alternating the entries 














of a and b: 

(7) 
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a ± b:={an-i,bn-i,. ■ ■ ,ao,bo). 


The interleaving of more than two n-tuples is defined analogously as the 

mn-tuple 


( 8 ) 




,<)■ 


Remark 10. The TM-index of a d-simplex T G Td that we are going to define 
is constructed by interleaving d + 1 L-tuples, where the first d are the binary repre¬ 
sentations of the coordinates of T’s anchor node and the last is the tuple consisting 
of the types of the ancestors ofT. 

Definition 11. Let T G be a tetrahedron of refinement level I with anchor 
node xo = {x,y,z)'^ G L^. Since x,y,z G No with 0 < x,y, z < 2^, we can express 
them as binary numbers with C digits, writing 

c-i c-i c-i 

(9) x ='^ Xj 2 \ y ='^ yj 2 ^, z = '^ Zj 2 T 

j=0 j=0 j=0 


We define the L-tuples X, Y, and Z as the L-tuples consisting of the binary digits of 
X, y, and z; thus, 


(10a) 

X = X{T) 

= {xc-i,- 

..,Xo) 

(10b) 

Y = Y{T) 

= {yc-i, 


(10c) 

Z = Z{T) 

= {zc-i, ■ 

..,Zo). 


In 2D we get the same definitions with X and Y, leaving out the z-coordinate. 

Definition 12. For a T G Td of level £ and each 0 < j < £ let T be the (unique) 
ancestor of T of level j. In particular, T^ = T. We define B{T) as the L-tuple 
consisting of the types of T’s ancestors in the first £ entries, starting with T^. The 
last L — £ entries of B{T) are zero: 


( 11 ) 


B = B{T) := type(Ti), type(T2),..., type(r), 0,..., 0 


Thus, if we write B as an L-tuple with indexed entries bi 

(12) B = B{T) = {bc-i,...,bo) G {0,...,d!-l}^, 


then the ith entry bi is given as 


(13) 


type(r^ *) 
0 


L-l>i>L-£, 

L-£>i>0. 


Definition 13 (tetrahedral Morton Index). We define the tetrahedral Morton 
index (TM-indexJ m(T) of a d-simplex T G Td as the interleaving of the L-tuples Z 
(for tetrahedra), Y, X and B. Thus, 


(14a) 


m{T) :=YYXYB 
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for triangles and 

(14b) m{T) := Z ±Y 1 X ± B 


for tetrahedra. 

This index resembles the well-known Morton index or Z-order for d-dimensional cubes, 
which we denote by fh here. For such a cube Q the Morton index is usually defined 
as the bitwise interleaving of its coordinates. Thus m{Q) = Z ZY J-X, respectively, 
m{Q) =Y±X; see [16,34,52]. 

As we show in Section 4, the TM-index can be computed from the Tet-id of T 
with no further information given. Thus, in an implementation it is not necessary to 
store the £-tuple B. 

The TM-index of a d-simplex builds up from packs of d bits Zi (for tetrahedra), 
Ui, and Xi followed by a type bi G { 0,..., d! — 1}. Since d! = 2 < 4 for d = 2, we can 
interpret the 2D TM-index as a quarternary number with digits {yiXi )2 and bi'. 


(15a) 


"i('r) = {{yc-ixc-i)2,bc-i, ■. ■ ,{yoxo)2,bo)4 

= {{2y^ + + b.A^’-) . 

i =0 


Similarly we can interpret it as an octal number with digits {ziyiXi )2 and bi for d = 3^ 
since then d! = 6 < 8: 


(15b) 


MT) = {{zc-iyc-iXc-i) 2 ,bc-i,---,{zoyoXo) 2 ,bo )8 
C-1 

= 'y ' ((4^i + 2?/i + . 

i =0 


The entries in these numbers are only nonzero up to the level i of T, since xc-i = 
yc-i = {zc-i =)bc-i = 0 for all i > £. The octal/quarternary representation (15) 
directly gives an order on the TM-indices, and therefore it is possible to construct an 
SFC from it, which we examine further in Section 3.6. We use m{T) to denote both 
the (d + l)£-tuple from (14) and the number given by (15). 

Let us look at Figure 5 for a short example to motivate this definition of the TM- 
index. Since the anchor coordinates and the type together with the level uniquely 
determine a d-simplex in 7d, one could ask why we do not define the index to be 
{{Z L)Y -La, type(r)), a pair consisting of the Morton index of the associated cube 
of T and the type of T. This index was introduced for triangles in a slightly modified 
version as semiquadcodes in [38] and would certainly require less information since 
the computation of the sequence B would not be necessary. However, it results in an 
SFC that traverses the leaf cubes of a refinement in the usual Z-order and inside of 
each cube it traverses the d! different simplices in the order S'o,..., S'di-i. As a result, 
there can be simplices T whose children are not traversed as a group, which means 
that there is a tetrahedron T', which is not an ancestor or descendant of T, such that 
some child Ti of T is traversed before T' and T' is traversed before another child Tj 
of T. Theorem 16 states that this problem does not occur with the TM-index that 
we have defined above. Figure 5 compares the two approaches for a uniform level 2 
refinement of T®. 

3.4. A different approach to derive the TM-index. There is another in¬ 
terpretation of the TM-index, which is particularly useful for the AMR algorithms 
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Fig. 5. Comparing a straightforward definition of a Morton-type SFC with our approach. Left: 
The SFC arising from taking the Morton order of the quadrants and only dividing into triangles on 
the last level. Thus the index is (Y type(T)). As we see on the two coarse triangles that are 
shaded, the children of a level 1 triangle are not necessarily traversed before any other triangle is 
traversed. Such a locality property would be desirable, and therefore this index is not suitable for 
our purposes. Right: The SFC arising from the TM-index from our Definition 13. We see that for 
each level 1 triangle its four children are traversed as a group. Theorem 16 states that this property 
holds for any parent triangle/tetrahedron. The order in which the children are traversed depends 
(only) on the type of the parent and is different from Bey’s order given by (2). 


presented in Section 4. In order to define it we introduce the concept of the so-called 
cube-id. According to Figure 2 we number the 2“^ corners of a d-dimensional cube by 
Co,... ,C 2 <j-i in a zyx-ordei {x varies fastest). When refining a cube to 2'^ children, 
each child has exactly one of the Ci as a corner, and it is therefore convenient to 
number the children by cq , ... ,C2d-i as well. For the child Ci we call the number i 
the cube-id of that child; see Figure 6 for an illustration. Since each cube Q that 
is not the root has a unique parent, it also has a unique cube-id. This cube-id can 
easily be computed by interleaving the last significant bits of the z- (in 3D), y-, and 
cc-coordinates of Q’s anchor node. 

Definition 14. Because each d-simplex T S Xd of level £ has a unique associated 
cube we define the cube-id of T to be the cube-id of the associated cube of T , that is, 
the d-cube of level £ that has the same anchor node as T . 

If X, Y (and Z) are as is in Definition II then we can write the cube-id of T’s ancestors 
as 

cube-id(r*) = (yiXi )2 in 2D, 

(16) 

cube-id(r®) = {ziyiXi )2 in 3D, 
and therefore using (15) we can rewrite the TM-index of T as 

(17) m(T) = (cube-id(r^), type(r^),..., cube-id(T^), type(T^), 0,..., 0)2<i. 

This resembles the Morton index of the associated cube Qt of T, since we can write 
this as 

(18) fhiQr) = (cube-id(Q^),..., cube-id((5^), 0 ,..., 0 ) 2 <j- 

3.5. Properties of the TM-index. In this section we show that the TM-index 
has properties similar to that of the Morton index for cubes. We are particularly in¬ 
terested in locality properties stating that refining a simplex only locally changes the 
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Fig. 6. Left: A square is refined to four children, each of which corresponds to a corner of the 
square. The number of the corner is the cube-id of that child. Right: In three dimensions a cube is 
refined to eight children. Their cube-ids and corner numbers are shown as well. 


order given by the TM-index. We begin by stating the uniqueness of the TM-index 
if additionally a level £ is given. 

Proposition 15. Together with a refinement level £, the TM-index m{T) unique¬ 
ly determines a d-simplex in Td- 

Proof. If £ = 0, then there is only one simplex of level £ in Td, which is T^. So 
let £ > 0 and m = m(T) be given as in (14), and let £ be the level of T. From m we 
can compute the x- ,y- (and z-) coordinates of the associated cube of T. We can also 
compute the type of T from the TM-index. By Corollary 7 this information uniquely 
determines T. □ 

For the Morton index fh for cubes the following important properties are known [52] : 

(i) The Morton indices of the descendants of a parent cube are larger than or equal 
to the index of the parent cube. 

(ii) A Morton index of a cube Q is the prehx of an index of a cube P of higher level 
than Q if and only if P is a descendant of Q. 

(iii) Refining only changes the SFC locally. Thus, if Q is a cube and P is a cube 
with rri{Q) < rh{P) and P is not a descendant of Q, then m{Q') < rh{P) for 
each descendant Q' oiQ. 

Property (iii) defines a hierarchic invariant of the SFC that is specific to our con¬ 
struction (see Figure 5). We show below that properties (i), (ii) and (iii) hold for 
d-simplices and the TM-index described by (14). 

Theorem 16. For arbitrary d-simplices T S ^ Td the TM-index satisfies the 
following: 

(i) If S is a descendant ofT then m{T) < m(S). 

(ii) If £{T) < £{S), then m{T) is a prefix of m(S) if and only if S is a descendant 
ofT. 

(iii) If m(T) < m(S) and S is no descendant ofT, then for each descendant T' ofT 
we have 

(19) m(r) < m(T') < m{S). 

The proof of Theorem 16 requires some work and we need to show a technical result 
first. Hereby, we consider only the 3D case, since for 2D the argument is completely 
analogous. We define an embedding of the set of all TM-indices into the set of Morton 
indices for 6D cubes. Since the properties (i)-(iii) hold for these cubes it follows that 
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they hold for tetrahedra as well. To this end, for a given tetrahedron T G Ta we 
interpret each entry bj of B{T) as a 3-digit binary number 

( 20 ) b, = 

which is possible since bj G {0, We obtain three new £-tuples 

satisfying 

(21) B = B^ Lb^ L B°, 
and thus we can rewrite the TM-index as 

(22) m(T) = z1y1x1b‘^1b^1b°. 

Note that we can interpret each i?* as an /1-digit binary number for which we have 
0 < i?® <2^. Now let Q denote the set of all 6D cubes that are a child of the cube 

Qo:=[0,2^]6: 


(23) 


Q = {(5 I Q is a descendant of Qo of level 0 < £ < £ } . 


Since a cube Q G Q is uniquely determined by the six coordinates {xq, ... of 
its anchor node plus its level £, we also write Q = Q{xQ,...,x^),e- Note that the 
Morton index for a cube can be defined as the bitwise interleaving of its anchor node 
coordinates [34]: 

(24) m(Q) = ± ± ± ± 1 X°. 

Proposition 17. The map 


(25) 


$: 


% 

T 


is injective and satisfies 
(26) 


Q, 

Q{B°{T),B^(T),B‘^(T),x{T),y{T),z{T)),t{T) 


m($(T)) = m(T). 


Furthermore, it fulfills the property that T' is a child of T if and only if $(T') is a 
child of $(r). 

Proof. The equation m{T) = fh{^{T)) follows directly from the definitions of the 
TM-indices on Tjj and Q. From Lemma 15 we conclude that $ is injective. Now let 
T',T G Ta, where T' is a child of T. Furthermore, let t = £(T). We know that 
Q' := $(T') is a child of Q := $(T) if and only if for each t G { 0,..., 5 } it holds that 


(27) 


xfiQ') G ^Xi{Q),xfiQ) + 2^ 


Because of the underlying cube structure (compare Figure 3) we know that the x- 
coordinate of the anchor node of T' satisfies 


(28) 


:(T') G { x(T),x{T) + } , 


and likewise for Y{T') and Z{T'). Therefore, (27) holds for i = 3,4,5. By definition 
B^{T') is the same as B^{T) except at position C — {£ + 1), where 


(29) 


B^(r')£-(m)=6):_(,+i)(T')G{0,l} 
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and 

(30) B^{T)c-(i+i)=Q. 

Hence, we conclude that (27) also holds for i = 0,1, 2. So ^{T') is a child of $(T). 

To show the other implication, let us assume that $(r') is a child of <i)(T). Since 
t'(T') = > 0, T' has a parent and we denote it by P. In the argument above 

we have shown that $(P) is the parent of $(T') and because each cube has a unique 
parent the identity $(P) = $(r) must hold. Therefore, we get P = T since $ is 
injective; thus, T' is the child of T. □ 

Inductively we conclude that T' is a descendant of T if and only if <I>(T') is a descen¬ 
dant of $(T). Now Theorem 16 follows, because the desired properties (i)-(iii) hold 
for the Morton index of cubes. 

3.6. The space-filling curve associated to the TM-index. By interpreting 
the TM-indices as 2'^-ary numbers as in (15) we get a total order on the set of all 
possible TM-indices, and therefore it gives rise to an SFC for any refinement 5^ of 
T^. In this section we further examine the SFC derived from the TM-index. We give 
here a recursive description of it, similarly to how it is done for the Sierpinski curve 
and other cubical SFC by Haverkort and van Walderveen [24]. 

Part (hi) of Theorem 16 tells us that the descendants of a simplex T are traversed 
before any other simplices with a higher TM-index than T are traversed. However, the 
order that the children of T have relative to each other can be different to the order 
of children of another simplex T'. In particular the order of the simplices defined by 
the TM-index differs from the order (2) defined by Bey. We observe this behavior in 
2D in Figure 5 on the right-hand side: For the level 1 triangles of type 0 the children 
are traversed in the order 

(31) To,ri,r3,T2 

and the children of the level 1 triangle of type 1 are traversed in the order 

(32) ro,r3,Ti,r2. 

In fact, the order of the children of a simplex T depends only on the type of T, as we 
show in the following Proposition. 

Proposition 18. If T,T' G Td are two d-simplices of given type b = type(T) = 
type(r'), then there exists a unique permutation a = ah o/ { 0 ,..., 2“^ — 1 } such that 

miT^(o)) < < ■ ■ ■ < 

(33) and 

< HKii)) <■■■< 

Thus, the children of T and the children of T' are in the same order with respect to 
their TM-index. 

Proof. By ordering the children of T and T' with respect to their TM-indices, we 
obtain a and a' with 


miTcriO)) < miTail)) < ■ ■ ■ < TO('7)T(2‘i-l)), 
MK'(o)) < "*(^a'(i)) < • • • < 


(34) 
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These permutations are well-defined and unique with this property because different 
simplices of the same level never have the same TM-index; see Proposition 15. It 
remains to show that a' = cr. Let t = i{T) and t! = £(T'). Since the TM-indices of 
the children of T do all agree up to level £, we see, using the notation from (15), that 
their order a depends only on the d 1 numbers {z is omitted for d = 2) 

(35) yc-(i+i){Ti), xc-{E+i){Ti) and &£-(^+i)('7i)- 

The same argument applies to a' and t'. From now on we carry out the computations 
for d = 3. Since type(T) = type(r') we can write 

(36) T = XT' + c. 


with 


(37) 


A = 2"^ 


c = 


jx{T) - x{T')\ 
y{T) - y{T') . 

\z{T)-z{T)) 


Since the refinement rules (2) commute with scaling and translation we also obtain 


(38) 


T, = AI)' + c 


for the children of T and T' and therefore 


(39) hc-{f.+i){Ti) = type(r,) = type(7;') = 
for 0 < i < 2'^. Furthermore, we have 

(40) xc-ii+i){T,) = (x{Ti) - a;(r))2-^+(^+i) 
from which we derive 

X{x{T[) - a;(r))2-^+(^+i) 

2^'-\x{Tl) - x{T'))2-^+^^+^'> 

(x(T') - x(r))2-^+(^'+i) 

and analogously 

yc-{i+i){Ti) = yc-{i'+i){Tl), 

zc-(i+i){Ti) = zc-(e'+i){T'). 

This shows that the tetrahedral Morton order of the children of T and T' are the 
same and o' must equal a. □ 

Definition 19. Let T a Td such that T’s parent P has type b and T is the ith 
child of P according to Bey’s order (2), 0 < * < 2'^. We call the number ab{i) the 
local index of the d-simplex T and use the notation 

(43) /loc(r) := abii) 

to denote the child number in the TM-ordering, subsequently written TM-child. By 
definition, the local index of the root simplex is zero, /loc(Tj) := 0. Table 2 lists the 
local indices for each parent type. 


xc-(e+i){Ti) — 


(41) 
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-^loc 

3D 

To 

Ti 

T2 

Child 

Ta T4 

Ts 

To 

Ty 

0 

0 

1 

4 
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2 

3 

6 
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1 

0 

1 

5 

7 

2 

3 

6 

4 

b 2 
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3 

4 

7 

I 

2 

6 

5 

3 

0 

I 

6 

7 

2 

3 

4 

5 

4 

0 

3 

5 

7 

I 

2 

4 

6 

5 

0 

3 

6 

7 

2 

1 

4 

5 


Table 2 


-^loc 

2D 

Child 

To Ti T2 Ta 

b « 

1 

0 13 2 

0 2 3 1 


The local index for the children of a d-simplex T according to the TM-ordering. For each type b, 
the 2^ children Tq, ..., T 2 d_i of a simplex of this type can be ordered according to their TM-indices. 
The position of Ti according to the TM-order is the local index 




Fig. 7. Left: Using the notation from [24] we recursively describe the SFC arising from the 
TM-index for triangles. The number inside each child triangle is its local index. R denotes the 
refinement scheme for type 0 triangles and F for type 1 triangles. This pattern can be obtained from 
Tables 1 and 2. Right: The SFC for an example adaptive refinement of the root triangle. 


Thus, we know for each type 0 < 6 < d! how the children of a tetrahedron of type 
b are traversed. This gives us an approach for describing the SFC arising from the 
TM-index in a recursive fashion [24] . By specifying for each possible type b the order 
and types of the children of a type b simplex, we can build up the SFC. In Figure 7 
we describe the SFC for triangles in this way. In three dimensions it is not convenient 
to draw the six pictures for the different types, but the SFC can be derived similarly 
from Tables I and 2. 

4. Algorithms on simplices. In this section we present fundamental algo¬ 
rithms that operate on d-simplices in Td- These algorithms include computations of 
parent and child simplices, computation of face-neighbors and computations involved 
with the TM-index. To simplify the notation we carry out all algorithms for tetrahe- 
dra and then describe how to modify them for triangles. We introduce the data type 
Tet and do not distinguish between the abstract concept of a Tet and the geomet¬ 
ric object (tetrahedron or triangle) that it represents. The data type Tet T has the 
following members: 

• T1 — the refinement level of T ; 

• T.x = (T.x, T.y, T.z) — the x- ,y- and z-coordinates of T’s anchor node, also 
sometimes referred to as T.Xq, T.xi, and T.X 2 \ 

• T.b — the type of T. 

In 2D computations the parameter T.z is not present. To avoid confusion we use the 
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notation x to denote vectors in and x (without arrow) for integers, thus numbers 
in Z. From Corollary 7 we know that the values stored in a Tet suffice to uniquely 
identify a d-simplex T € T. 

Remark 20 (Storage requirement). The algorithms that we present in this section 
only need this data as input for a simplex resulting in a fixed storage size per Tet. If, 
for example, the maximum level C is 32 or less, then the coordinates can he stored 
in one A-byte integer per dimension, while the level and type occupy one byte each, 
leading to a total storage of 

, 2x4 + l + l = 10 bytes per Tet in 2D, 

^ ' 3x4 + 1 + 1 = 14 bytes per Tet in 3D. 

Remark 21 (Runtime). Most of these algorithms run in constant time indepen¬ 
dent of the maximum level C. The only operations using a loop over the level C orT.i, 
thus having 0{C) runtime, are computing the consecutive index from a Tet and ini¬ 
tializing a Tet according to a given consecutive index. Hence, we show how to replace 
repetitive calls of these relatively involved algorithms by more efficient constant-time 
ones. 

4.1. The coordinates of a d-simplex. The coordinates of the d + 1 nodes of 
a d-simplex T can be obtained easily from its Tet-id, the relation (3), and simple 
arithmetic: If T is a d-simplex of level type b and anchor node xq G Z'^, then 

(45) T = 2^-^Sb-hXo. 

Hence, in order to compute the coordinates of T we can take the coordinates of St, 
as given in (3), and then use relation (45). A closer look at (3) reveals that it is not 
necessary to examine all coordinates of Sb in order to compute the Xi, but that they 
can also be computed arithmetically. This computation is carried out in Algorithm 

4.1. 


Algorithm 4.1: Coordinates(Tet T) 


1 

A ^ (T.f, 0,0,0); 






2 

h ^ 2^-^; 






3 


/* Replace 

with i T.b 

for 

2D 

*/ 

4 

if T.b %2 = 0 then 






5 

j ^ (f + 2) % 3 






6 

else 






7 

L (f + l)%3 






8 

^[1] i — -^[0] “1“ hci] 






9 

X[2] •<— A[l] + hej; /* Replace 

with X[2] •<— 

X[0] + {h,h)^ 

for 

2D 


10 

A[3] ^ A[0] + {h,h,hY-, 

/* Remove this line 

for 

2D 


11 

return X; 







4.2. Parent and child. In this section we describe how to compute the Tet-ids 
of the parent P{T) and of the 2'^ children Ti, 0 < i < 2'^, of a given d-simplex T G Td- 
Computing the anchor node coordinates of the parent is easy, since their first T.i — 1 
bits correspond to the coordinates of T’s anchor node and the rest of their bits is zero. 
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Fig. 8. The type of the parent of a d-simplex T can be determined from T's cube-id c and type 
b. Left: The values of Pt from (46) in the 2D case. Middle: The values of the function Pt in the 
3D case. Right: Two examples showing the computation in 3D. (1) The small tetrahedron in the 
top left corner (orange) has cube-id 5 and type 4, and its parent (green) can be seen to have type 5. 
(2) The small tetrahedron at the bottom right (red) has cube-id 3 and type S, and its parent (blue) 
has type 2. 


For computing the type of P{T)^ we need the function 

Pt: {0,...,2'^-l} X {0,...,d!-l} ^ {0,...,d!-l}, 

^ ’ (cube-id(T),T.6) ^ P.b, 

giving the type of T’s parent in dependence of its cube-id and type. In Figure 8 we 
list all values of this function for d G { 2, 3 }. 


Algorithm 4.2: c-id(Tet r,int i) 

1 2^-^ 


2 i\= (T.x&d)?l: 0 


3 i \= lT.ykh)12-. 0 


4 i \= h.z&/i)?4: 0 

/* Remove this line for 2D */ 

5 return i 



Algorithm 4.3: Parent(Tet T) 

1 





2 

PI ^TI-l 



3 

P.x T.x & 

-^h 



4 

P.y T.y & 




5 

P.z ^ T.z & 

-^h 

/* Remove this line for 2D 

*/ 

6 

P.b Pt(c-i 

.d(r, T.l),T.b) 

/* See (46) and Figure 8 for Pt 

*/ 

7 

return P 





The algorithm Parent to compute the parent of T now puts these two ideas 
together, computing the coordinates and type of P{T). Algorithm 4.3 shows an 
implementation. It uses Algorithm 4.2 to compute the cube-id of a d-simplex. 

For the computation of one child Ti of T for a given i £ { 0,..., 2^^ — 1 } we look 
at Bey’s definition of the subsimplices in (2) and see that in order to compute the 
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anchor node of the child we need to know some of the node coordinates xq,. .. ,Xd 
of the parent simplex T. These can be obtained via Algorithm 4.1. However, it is 
more efficient to compute only those coordinates of T that are actually necessary. 
To compute the Tet-id of Ti we also need to know its type. The type of Ti depends 
only on the type of T, and in the algorithm we use the function Ct (children type) 
to compute this type. Ct is effectively an evaluation of Table 1. Algorithm 4.4 shows 
now how to compute the coordinates of the Ah child of T in Bey’s order. 

When we would like to compute the Ah child of a d-simplex T of type b with 
respect to the tetrahedral Morton order (thus the child Tk of T with /ioc(T/c) = i) 
we just call Algorithm 4.4 with as input. The permutations are available 

from Table 2; see (43) and Algorithm 4.5. 


Algorithm 4.4: Child(Tet r,int i) 

1 A -i— Coordinates(T) 

2 if J = 0 then jA— 0 

3 else if z e { 1, 4, 5 } then j t— 1 

4 else if z £ { 2,6, 7 } then j t— 2 

5 else if z = 3 then j •(— 3 /* If z = 3 then j 1 for 2D */ 

6 T,.f ^ i(A[0]+A[j]) 

7 Ti.b-<r- Ct{T.b,i) /* See Table 1 */ 




Algorithm 4.5: TM-Child(Tet T,int z) 


1 return Child (T, 

/* See Table 2 */ 


4.3. Neighbor simplices. Many applications—e.g., finite element methods— 
need to gather information about the face-neighboring simplices of a given simplex 
in a refinement. In this section we describe a level-independent constant-runtime 
algorithm to compute the Tet-id of the same level neighbor along a specific face / of 
a given d-simplex T. This algorithm is very lightweight since it only requires a few 
arithmetic computations involving the Tet-id of T and the number /. In comparison 
to other approaches to finding neighbors in constant time [11,32], our algorithm does 
not involve the computation of any of T’s ancestors. 

The d -I- 1 faces of a d-simplex T — [xq, ■. ■, Xd\ are numbered /o, ■ • ■, /d in such 
a way that face fi is the face not containing the node x^. To examine the situation 
where two d-simplices of the same level share a common face, let denote a uniform 
refinement of of a given level 0 < £ < £, 

(47) Td ■= { T I r is a descendant of of level .^ } C 7d- 
Td can be seen as the disjoint union of all the 7^^: 

(48) Td=\J Tl 

f.=o 

Given a d-simplex T G Tj and a face number z £ { 0,..., d }, denote T’s neighbor in 
Td across face / = /i by A//(T), and denote the face number of the neighbor simplex 
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Type 1 


Type 4 


Fig. 9. A tetrahedron T of type 5 (in the middle, red) and its four face neighbors (blue) of 
types 1,0,4, and 3, drawn with their associated cubes (exploded view). We see that the type ofT’s 
neighbors depends only on its type, while their node coordinates relative to T’s depend additionally 
on T’s level. (Color available online.) 


Aff(T) across which T is its neighbor by /(T). Hence, the relation 
(49) = T 

holds for each face / of T. 

Our aim is to compute the Tet-id of Nf{T) and f{T) from the Tet-id of T. Using 
the underlying cube structure this problem can be solved for each occuring type of 
d-simplex separately, and the solution scheme is independent of the coordinates of T 
and of £. In Figure 9 the situation for a tetrahedron of type 5 is illustrated, and in 
Tables 3 and 4 we present the general solution for each type. 

Using these tables, a constant-time computation of the Tet-id of A//(T) and of 
/(T) from the Tet-id of T is possible, and the 3D case is carried out in Algorithm 4.6. 
Note that this algorithm uses arithmetic expressions in T.b to avoid the sixfold dis¬ 
tinction of cases. 

Remark 22. To find existing neighbors in a nonuniform refinement we use Al¬ 
gorithm 4-6 in combination with Parent and TM-Child and comparison functions. 

Of course it is possible that Nf{T) does not belong to Tj any longer. If this is 
the case, then f was part of the boundary of the root simplex T^. We describe in the 
next section how we can decide in constant time whether Aff{T) is in or not. 

For completeness, we summarize the geometry and numbers of d-simplices tou¬ 
ching each other via a corner (d = 2 or d = 3) or edge (only d = 3). In this paper we 
do not list these neighboring tetrahedra in detail. 

For d = 3 each corner in the mesh has 24 adjacent tetrahedra; thus each 
tetrahedron has at each corner 23 other tetrahedra that share this particular corner. 
For d = 2 the situation is similar, with six triangles meeting at each corner. To 
examine the number of adjacent tetrahedra to an edge we distinguish three types of 
edges in 7)|: 

1 . edges that are also edges in the underlying hexahedral mesh; 

2 . edges that are the diagonal of a side of a cube in the hexahedral mesh; 

3. edges that correspond to the inner diagonal of a cube in the hexahedral mesh. 
Edges of the first and third kind have six adjacent tetrahedra each, and edges of the 
second kind do have four adjacent tetrahedra each. 

4.4. The exterior of the root simplex. When computing neighboring d- 
simplices it is possible that the neighbor in question does not belong to the root 
simplex T° but lies outside of it. If we look at face-neighbors of a d-simplex T, the 
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2D 

/ 

0 

1 

2 


Uf.b 

1 

1 

1 


Aff.x 

T.x + h 

T.x 

T.x 

T.b = 0 

Nf.y 

T.y 

T.y 

T.y-h 


f 

2 

1 

0 


Nf.b 

0 

0 

0 


Aff.x 

T.x 

T.x 

T.x — h 

T.b = 1 

Aff.y 

T.y + h 

T.y 

T.y 


/ 

2 

1 

0 


Table 3 


Face neighbors in 2D, For each possible type b & {0,1} of a triangle T and each of its faces 
f = fit i S {0,1,2}, the type, anchor node coordinates, and corresponding face number f ofT’s 
neighbor across f are shown. In the 2D case we can directly compute Af.b = 1 — T.b and f = 2 — f. 
Here, h = 2^~^ refers to the length of one square of level t. 
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Uf.b 
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Mf.b 

5 

4 
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T.x 
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T.x 

T.x 




Mf.x 

T.x 

T.x 

T.x 

T.x — h 

T.b = 1 

Mf.y 

T.y 

T.y 

T.y 

T.y 



T.b = 4: 

Mf.y 

T.y 

T.y 

T.y 

T.y 


Mf.z 

T.z 

T.z 

T.z 

T.z- 

h 



Mf.z 

T.z + h 

T.z 

T.z 

T.z 


f 

3 

1 

2 

0 




f 

3 

1 

2 

0 


Mf.b 

0 

1 

3 

4 




Mf.b 

1 

0 

4 

3 


Mf.x 

T.x 

T.x 

T.x 

T.x 




Mf.x 

T.x 

T.x 

T.x 

T.x 

T.b = 2 

Mf.y 

T.y + h 

T.y 

T.y 

T.y 



T.b = 5 

Mf.y 

T.y 

T.y 

T.y 

T.y-h 


Mf.z 

T.z 

T.z 

T.z 

T.z- 

h 



Mf.z 

T.z + h 

T.z 

T.z 

T.z 


f 

3 

1 

2 

0 




/ 

3 

1 

2 

0 


Table 4 


Face neighbors in 3D. For each possible type b £ { 0,1, 2, 3, 4, 5 } of a tetrahedron T and each 
of its faces / = /i,i£{0,1,2,3} the type Mf{T).b of T’s neighbor across f, its coordinates of the 
anchor node Aff{T).x, Mf{T).y, and the corresponding face number f{T), across which T 

is Mf(T) ’s neighbor, are shown. 


fact that the considered neighbor lies outside means that the respective face was on 
the boundary of T^. In order to check whether a computed d-simplex is outside the 
base simplex, we investigate a more general problem: Given anchor node coordinates 
(xotVo)^ € respectively {xo,yo, zq)'^ € of type b a level £, decide whether the 
corresponding c?-simplex N lies inside or outside of the root tetrahedron T°: N gTJ 
OT N ^ Ty ■ At the end of this section we furthermore generalize this to the problem 
of deciding for any two d-simplices N and T whether or not N lies outside of T. We 
solve this problem in constant time and independent of the levels of N and T. 

We examine the 3D case. Looking at we observe that two of its boundary faces 
correspond to faces of the root cube, namely, the intersections of Tg with the y = 0 
and the x = 2^ planes. The other two boundary faces of Tg are the intersections with 
the X = z and the y = z planes. Thus, the boundary of Tg can be described as the 
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Algorithm 4.6: Face-neighbor(Tet T,int /) 


1 

2 

3 

4 

5 

6 

7 

8 
9 

10 

11 

12 

13 

14 

15 
10 
17 


b i — T.b^ Xq ^— T.Xq^ X\ i — T.X\j X2 ^— T.X2 

if / = 1 or f = 2 then 

I /^/if(6%2 = 0 and / = 2) or {b %2 ^ 0 and / = 1) then 

I b b + 1 

else 

^b^b -1 

else 

/^3-/ 

h ^ 2 ^-^-^ 

if / = 0 then /* 

i ^ b div 2 
Xi •(— T.Xi + h 

b^b+{b%2 = 1?2 : 4) 

else /* 

i (& -I- 3) % 6 div 2 

Xi ■<r- T.Xi — h 

&^6-F(&%2 = 0?2:4) 


/ = 0 


/ = 3 


*/ 


*/ 


18 return {xo,Xi,X 2 ,b% 6 , f) 


intersection of Tg with those planes. We refer to the latter two planes as Ei and £ 2 - 
Let A be a tetrahedron with anchor node (xq, yo,-^o)^ G of type b and level 
i and denote with {xi,yi,Zi)’^ the coordinates of node i of N. Since {xi,yi,Zi)^ > 
(cco, ?/ 0 ;- 2 ^ 0 )^ (componentwise), we directly conclude that if Xg > 2 ^ or yg < 0 then 
N ^ Tg. Because the outer normal vectors of Tg on the two faces intersecting with 
El and E 2 are 


(50) 



and n 2 = 

72 



we also conclude that A ^ Ts if zq — xq > 0 or ?/o — -^o > 0. Now we have already 
covered all the cases except those where the anchor node of A lies directly in Eg or 
A 2 . In these cases we cannot solve the problem by looking at the coordinates of the 
anchor node alone, since there exist tetrahedra T' G Eg with anchor nodes lying in one 
of these planes (see Figure 10 for an illustration of the analogous case in 2D). This 
depends on the type of the tetrahedron in question. We observe that a tetrahedron 
T' G Eg with anchor node lying in Eg can have the types 0 ,1, or 2, and a tetrahedron 
with anchor node lying in E 2 can have the types 0, 4, or 5. We conclude that to 
check whether A is outside of the root tetrahedron we have to check if any one of six 
conditions is fulfilled. In fact these conditions fit into the general form below with 
Xi = X, Xj = y, Xk = z, and T as the root tetrahedron; thus T.x = T.y = T.z = 0 and 
T.b = 0. 

These generalized conditions solve the problem to check for any two given tetra¬ 
hedra A and T, whether A lies outside of T or not. 

Proposition 23. Given two d-simplices A, T with N.£ > T.i, then A is outside 
of the simplex T—which is equivalent to saying that A is no descendant of T—if and 
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2D 

T.b 

0 1 

3D 

T.b 

0 1 2 3 4 5 

Xi 

Xj 

X y 

y X 

Xi 

Xj 

Xk 

X X y y z z 

y z z X X y 

z y X z y X 



Table 5 


Following the general scheme described in this section to compute whether a given d-simplex N 
lies outside of another given d-simplex T, we give the coordinates Xi, Xj, and x^ in dependence of 
the type of T. 


only if at least one of the following conditions is fulfilled. 
For 2D, 

( 51 a) N.xi - T.Xi > 2 ^-'^-^, 

( 51 b) 

N.Xj — T.Xj < 0, 

( 51 c) {N.Xj — T.Xj) — {N.Xi — T.Xi) > Oj 
( 51 d) 

N.Xi — T.Xi = N.Xj — T.Xj and N.b = | ^ 


For 3D, 
(52a) 


N.Xi - T.x^ > 2 ^-^-^ 

(52b) 

N.Xj — T.Xj < 0, 

(52c) 

{N.Xk - T.Xk) - (N.Xi - T.Xi) > 0, 
(52d) 

{N.Xj — T.Xj) — {N.Xk — T.Xk) > Oj 


(52e) 

N.Xj — T.Xj = N.Xk — T.Xk 


and -/V.6 e 


{r.6 + l,T.6 + 2,T.6 + 3}, 
{r.6-l,T.6-2,T.6-3}, 


(521) 

N.Xk 


and 


- T.Xk 
N.h e 


= N.Xi — T.Xi 

f {T.b-l,T.b-2,T.b-3}, 
\ {T.b + l,T.b + 2,T.b + 3}, 


if T.b is even, 
if T.b is odd, 


if T.b is even, 
if T.b is odd. 


(52g) 

N.Xj — T.Xj = N.Xk — T.Xk and N.Xi — T.Xi = N.Xk — T.Xk 


and N.b ^ T.b 


The coordinates Xi, Xj, and Xk are chosen in dependence of the type of T according 
to Table 5. 
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Fig. 10. A uniform level 2 refinement of 
the triangle in 2D and triangles lying outside 
of it with their anchor nodes marked. When 
deciding whether a triangle with given anchor 
node coordinates lies outside of there are 
four cases to consider, one for each face ofT^. 
For the two faces lying parallel to the X-axis, 
respectively, Y-axis, it suffices to check whether 
the x-coordinate is greater than or equal to 2^, 
or whether the y-coordinate is smaller than 0. 
Similarly one can conclude that the triangle lies 
outside ofT^ if its x-coordinate is smaller than 
its y-coordinate. If both coordinates agree (and 
none of the previous cases applies) then the 
given triangle is outside T® if and only if its 
type is 1. 


Proof. By shifting N by {—T.x) we reduce the problem to checking whether the 
shifted d-simplex lies outside of a d-simplex with anchor node 0, level T.f and type 
T.b. For d = 3 the proof is analogous to the above argument, where we considered 
the case b = 0 and £ = 0. In two dimensions the situation is even simpler, since there 
exists only one face of the root triangle that is not a coordinate axis (see Figure 10). □ 

4.5. A consecutive index. In contrast to the Morton index for cubes, the TM- 
index for d-simplices does not produce a consecutive range of numbers. Therefore, 
two simplices T and T' of level £ that are direct successors/predecessors with respect 
to the tetrahedral Morton order do not necessarily fulfill m(T) = m{T') ± or 

m{T) = m(T') ± 1. For d = 3 this follows directly from the fact that each bfi occupies 
three bits, but there are only six values that each V can assume, since there are only 
six different types. In 2D this follows from the fact that not every combination of 
anchor node coordinates and type can occur for triangles in 72, the triangle with 
anchor node (0,0) and type 1 being one example. This also means that the largest 
occurring TM-index is bigger than 2'^^ — 1. 

Constructing a consecutive index that respects the order given by the TM-index 
is possible, as we show in this section. Since in practice it is more convenient to work 
with this consecutive index instead of the TM-index, our aim is to construct for each 
level £ a consecutive index I{T) e { 0,..., 2^^^ — 1 } , such that 

(53) \fT,SeTi: I{T) < I{S) miT) <miS). 

This index can also be understood as a bijection 

(54) /: { m e No I m = m(T) for a T e 7);^ } ^ { 0,..., 2'^'^ - 1 } , 

mapping the TM-indices of level £ d-simplices to a consecutive range of numbers. It is 
obvious that I{T^) = 0. The index I{T) can be easily computed as the Cdigit 2‘^-ary 
number consisting of the local indices as digits, thus 

(55) /(T) = (/ioc(ri),...,/ioc(T0)2- 

Algorithm 4.7 shows an implementation of this computation. It can be done 
directly from the Tet-id of T, and thus it is not necessary to compute the TM-index 
of T first. 
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Algorithm 4.7: /(TetT) 


1 

2 

3 

4 

5 


I ^O.b^T.b 
for T.i > j > 1 do 
c -i— c-id(T, i) 

I^I + 8UUc) 

b <— Pt(c, b) 


0 return I 


/* See Table 6; multiply with 4® for 2D */ 


The inverse operation of computing T from I{T) and a given level i can be carried 
out in a similar fashion; see Algorithm 4.8. For each 0 < * < £ we look up the type 
b and the cube-id of T® from /ioc(T'®) and the type of Parent (T®) = T® ^ (starting 
with type(T°) = 0) via Tables 7 and 8. From the cube-ids we can build up the anchor 
node coordinates of T. The last computed type is the type of T. The runtime of this 
algorithm is 0(T.£). 


Algorithm 4.8: T(consecutive index /,int £) 

1 T.x, T.y, T.z t— 0, 6 t— 0 

2 for 1 < i < £ do 


Get I\oc{T^) from I 

/* See (55) 

*/ 

cG- c-id(r®), b^T\b 

/* See Tables 7 and 8 

*/ 

if c & 1 then T.x G- T.x + 2^“® 



if c & 2 then T.y G- T.y + 2'^“® 



if c & 4 then T.z G- T.z + 2^“® 

/* Remove this line for 2D 

*/ 


8 T.b^b 

9 return T 


Similar to Algorithm 4.7 is Algorithm ls_valid, which decides whether a given 
index m G [0, 2®'^) n Z is in fact a TM-index for a tetrahedron. Thus, in the spirit of 
Section 3.5 we can decide whether a given 6D cube is in the image of map (25) that 
embeds Ts into the set of 6D subcubes of [0, 2'^]®. The runtime of ls_valid is d(£). 


Algorithm 4.9: ls valid(m € [0,2®"^) nZ,£) 

1 / ^ 0 

2 k G- 6(£ — i) 

3 for £ > i > 1 do 

4 6 ^ (TOfc,mfe + i,TOfc+2)8 

5 c ^ (mfe+3, mfc+4, mfc+5)8 

6 k G- 6(£ — z -I- 1) 

7 if (TOfc,mfe+i,mfe+ 2)8 Ft(c,6) then /* Take (0,0, 0)3 if z=l */ 

8 ^ return False 

9 return True 


The consecutive index simplifies the relation between the TM-index of a simplex 
and its position in the SFC. In the special case of a uniform mesh, the consecutive 
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iLic) 

3D 

0 

1 

cube-id 

2 3 4 

5 

6 

7 

0 

0 

1 

1 

4 

1 

4 

4 

7 

1 

0 

1 

2 

5 

2 

5 

4 

7 

b 2 

0 

2 

3 

4 

1 

6 

5 

7 

3 

0 

3 

1 

5 

2 

4 

6 

7 

4 

0 

2 

2 

6 

3 

5 

5 

7 

5 

0 

3 

3 

6 

3 

6 

6 

7 


Table 6 


Aoc(c) 

2D 

cube-id c 

0 12 3 

b 0 

1 

0 113 

0 2 2 3 


The local index of a tetrahedron T a T in dependence of its cube-id c and type b. 


cube-id(r) 

3D 

0 

1 

2 

/loc(T) 

3 4 

5 

6 

7 

0 

0 

1 

1 

1 

5 

5 

5 

7 

1 

0 

1 

1 

1 

3 

3 

3 

7 

P.b 2 

0 

2 

2 

2 

3 

3 

3 

7 

3 

0 

2 

2 

2 

6 

6 

6 

7 

4 

0 

4 

4 

4 

6 

6 

6 

7 

5 

0 

4 

4 

4 

5 

5 

5 

7 


cube-id(T) 

2D 

/loc(T) 

0 12 3 

P.b ® 

1 

0 113 

0 2 2 3 


Table 7 

For a tetrahedron T £ T of local Index whose parent P has type P.b we give the cube-id of T. 


index and the position are identical. 

4.6. Successor and predecessor. Calculating the TM-index corresponding to 
a particular consecutive index is occasionally needed in higher-level algorithms. This 
is relatively expensive, since it involves a loop over all refinement levels, thus some 
10 to 30 in extreme cases. However often the task is to compute a whole range of 
d-simplices. This occurs, for example, when creating an initial uniform refinement 
of a given mesh (see Algorithm New in Section 5.1). That is, for a given consecutive 
index /, a level £, and a count n, find the n level-£ simplices following the d-simplex 
corresponding to the consecutive index /, that is, the d-simplices corresponding to 
the n consecutive indices I, I 1,..., / n — 1. Ideally, this operation should run 
linearly in n, independent of £, but if we used Algorithm 4.8 to create each of the 
n -\- 1 simplices we would have a runtime of OinC). In order to achieve the desired 
linear runtime we introduce the operations Successor and Predecessor that run in 
average 0{l) time. These operations compute from a given d-simplex T of level i with 
consecutive index Ip the d-simplex T' whose consecutive index is /t + 1, respectively, 
/t — 1- Thus, T' is the next level i simplex in the SFC after T (resp. the previous one). 
Algorithm 4.10, which we introduce to solve this problem does not require knowledge 
about the consecutive indices It and It ± 1 and can be computed significantly faster 
than Algorithm 4.8; see Lemma 24. 

To compute the predecessor of T we only need to reverse the sign in Line 3 in the 
Successor_recursion subroutine of Algorithm 4.10. 

Lemma 24. Algorithm 4-. 10 has constant average runtime (independent of C). 

Proof. Because each operation in the algorithm can be executed in constant time, 
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T.b 

3D 

0 

1 

2 

/loc(T) 

3 4 

5 

6 

7 

0 

0 

0 

4 

5 

0 

1 

2 

0 

1 

1 

1 

2 

3 

0 

1 

5 

1 

P.b 2 

2 

0 

1 

2 

2 

3 

4 

2 

3 

3 

3 

4 

5 

1 

2 

3 

3 

4 

4 

2 

3 

4 

0 

4 

5 

4 

5 

5 

0 

1 

5 

3 

4 

5 

5 


Table 8 


T.b 

2D 

0 

/loc(T) 

1 2 

3 

P.b 

0 

0 

1 

0 

1 

1 

0 

1 

1 


For a tetrahedron T a T of local Index Roc whose parent P has type P.b we give the type ofT. 


Algorithm 4.10: Successor(Tet T) 

1 return Successor_recursion(T, T, T.^) 

Function Successor_recursioii(Tet r,Tet T',mt £) 

1 c <— c-id(T, i) 

2 From c and b look up i := I\oc{T^) /* See Table 6 */ 

3 i ^ (i + 1) % 8 

4 if I = 0 then /* Enter recursion (in rare cases) */ 

5 T' P- Successor_recursion {T,T',£— 1) 

0 bp- T'.b /* i) stores the type of T'^~^ */ 

7 else 

8 ^ b ^ Pt(c, b) 

9 From b and /loc = i look up {c',b') /* See Tables 7 and 8 */ 

10 Set the level £ entries of T'.x.T'.y and T'.z to d 

11 T'.h p- b' 

12 return T' 


the average runtime is nc, where c is a constant (independent of C) and n — 1 is the 
number of average recursion steps. Since in consecutive calls to the algorithm the 
variable i cycles through 0 to 2"^ — 1 we conclude that the recursion is on average 
executed in every 2'^th step, allowing for a geometric series argument. □ 


We see in Algorithm 4.10 the usefulness of the consecutive index. Because we are 
using this index instead of the TM-index, computing the index of the successor/pre¬ 
decessor only requires adding/subtracting 1 to the given index. On the other hand, 
computing the TM-index of a successor/predecessor would involve more subtle com¬ 
putations. 
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5. High-level AMR algorithms. To develop the complete AMR functionality 
required by numerical applications, we aim at a forest of quad-/octrees in the spirit 
of [16,28]. Key top-level algorithms are: 

• New. Given an input mesh of conforming simplices, each considered a root 
simplex, generate an initial refinement. 

• Adapt. Adapt (refine and coarsen) a mesh according to a given criterion. 

• Partition. Partition a mesh among all processes such that the load is bal¬ 
anced, possibly according to weights. 

• Balcince. Establish a 2:1 size condition between neighbors in a given refined 
mesh. The levels of any two neighboring simplices must differ by at most 1. 

• Ghost. For each process, assemble the layer of directly neighboring elements 
owned by other processes. 

• Iterate. Iterate through the local mesh, executing a callback function on 
each element and on all interelement interfaces. 

Since partitioning via SEC only uses the SEC index as information, we refer to al¬ 
ready existing descriptions of Partition for hexahedral or simplicial SFCs [16,40]. 
Balance, Ghost, and Iterate are sophisticated parallel algorithms and require addi¬ 
tional theoretical work, which is beyond the scope of this paper. 

Here, we briefly describe New and Adapt. In the forest-of-trees approach we model 
an adaptive mesh by a coarse mesh of level 0 d-simplices, the trees. Such a coarse 
mesh could be specified manually for simple geometries, or obtained from executing 
a mesh generator. Each level 0 simplex is identified with the root simplex and 
then refined adaptively to produce the fine and potentially nonconforming mesh of 
d-simplices. These simplices are partitioned among all processes; thus each process 
holds a range of trees, of which the first and last may be incomplete: Their leaves are 
divided between multiple processes. 

An entity F of the structure forest consists of the following entries 

• F.C — the coarse mesh; 

• F.K, — the process-local trees; 

• F.8k — for each local tree k the list of process-local simplices in tetrahedral 
Morton order. 

We acknowledge that New and Adapt are essentially communication-free, but still serve 
well to exercise some of the fundamental algorithms described earlier. 

5.1. New. The New algorithm creates a partitioned uniform level i refined forest 
from a given coarse mesh. To achieve this, we first compute the first and last d- 
simplices belonging to the current process p. From this range we can calculate which 
trees belong to p and for each of these trees, the consecutive index of the first and 
last d-simplices on this tree. We then create the first simplex in a tree by a call to 
T (Algorithm 4.8). In contrast to the New algorithm in [16] we create the remaining 
simplices by calls to Successor instead of T to avoid the 0{i) runtime of T in the 
case of simplices. Our numerical tests, displayed in Figure 11, show that the runtime 
of New is in fact linear in the number of elements and does not depend on the level i. 
Within the algorithm, K denotes the number of trees in the coarse mesh and P the 
number of processes. 

After New returns, the process local number of elements is known, and per-element 
data can be allocated linearly in an array of structures, or a structure of arrays, 
depending on the specifics of the application. 

5.2. Adapt. The Adapt algorithm modifies an existing forest by refining and 
coarsening the d-simplices of a given forest according to a callback function. It does 
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Algorithm 5.1: New(Coarse Mesh C, int £) 

1 n •<— 2^^^, N •<— nK ; /* * d-simplices per tree said global number of 

d-simplices */ 

2 fffirst \_Np/P\, giast ^ lN(p+ 1)/P\ — 1 ; /* Global numbers of first 

and last.. */ 


3 kfiYSt ^ Ldfirst/^J ; ^last Ldlast/^J; /* 

range */ 


4 

5 

6 

7 

8 
9 

10 

11 


for t e { fcfirst, ■ • ■ ,Mast } do 

^first (t — ^first) dfirst Tit . 0, 
Clast (t — ^last) ? dlast Tit .71 1, 

T •<—T (Cfirst,^); 

{T}; 

for e ^ I curst, ■ • ■ 5 G-iast 1} do 
T ^ Successor (T); 

£k £k ^ {T} 


..local simplex and local tree 


/* Call Algorithm 4.8 */ 


this by traversing the d-simplices of each tree in tetrahedral Morton order and passing 
them to the callback function. If the current d-simplex and its 2*^ — 1 successors form 
a family (all having the same parent), then the whole family is passed to the callback. 
This callback function accepts either one or 2^ d-simplices as input plus the index of 
the current tree. In both cases, a return value greater than zero means that the first 
input d-simplex should be refined, and thus its 2“^ children are added in tetrahedral 
Morton order to the new forest. Additionally, if the input consists of 2“^ simplices, 
they form a family, and a return value smaller than zero means that this family should 
be coarsened, thus replaced by their parent. If the callback function returns zero, the 
first given d-simplex remains unchanged and is added to the new forest, and Adapt 
continues with the next d-simplex in the current tree. The Adapt algorithm creates 
a new forest from the given one and can handle recursive refinement/coarsening. For 
the recursive part we make use of the following reasonable assumptions: 

• A d-simplex that was created in a refine step will not be coarsened during 
the same adapt call. 

• A d-simplex that was created in a coarsening step will not be refined during 
the same adapt call. 

From these assumptions we conclude that for recursive refinement we only have to 
consider those d-simplices that were created in a previous refinement step and that we 
only have to care about recursive coarsening directly after we processed a d-simplex 
that was not refined and could be the last d-simplex in a family. If refinement and 
coarsening are not done recursively, the runtime of Adapt is linear in the number of 
d-simplices of the given forest. 

An application will generally project or otherwise transform data from the previ¬ 
ous to the adapted mesh. This can be done within the adaptation callback, which is 
known to proceed linearly through the local elements, or after Adapt returns if a copy 
of the old mesh has been retained. In the latter case, one would allocate element data 
for the adapted mesh and then iterate over the old and the new data simultaneously, 
performing the projection in the order of the SFC. Once this is done, the old data 
and the previous mesh are deallocated [15]. 







A TETRAHEDRAL SPACE-FILLING CURVE 


29 


Strong scaling of New 


Runtime tests for New 


T^Procs 

Level 

T^Xetrahedra 

R.iiiitime [s] 

Factor 

64 

7 

1.073e9 

0.059 



8 

8.590c9 

0.451 

7.64 


9 

6.872el0 

3.58 

7.94 


10 

5.498ell 

28.6 

7.99 

256 

8 

8.590e9 

0.116 

- 


9 

6.S72el0 

0.898 

7.74 


10 

5.498ell 

7.15 

7.96 



Number of processes 


Fig. 11. Runtime tests for New on JUQUEEN. Left: Two strong scaling studies. A new uniform 
level 8 (circles) and level 10 (triangles) refinement of a coarse mesh of 512 root tetrahedra, carried 
out with 16 up to 2,048 processes and 1,024 '^P 131,072 processes with 16 processes per compute 

node. Right: The data shows that the runtime of New is linear in the number of generated elements 
and does not additionally depend on the level. The uniform refinement is created from a coarse 
mesh of 512 root tetrahedra. For the first computation on 64 processes we use 1 process per compute 
node and for the computation on 256 processes we use 2 processes per node. 


6. Performance evaluation. Given the design of the algorithms discussed in 
this paper, we expect runtimes that are precisely proportional to the number of el¬ 
ements and independent of the level of refinement. To verify this, we present scal¬ 
ing and runtime tests^ for New and Adapt on the JUQUEEN supercomputer at the 
Forschungszentrum Juelich [19], an IBM BlueGene/Q system with 28,672 nodes con¬ 
sisting of 16 IBM PowerPG A2 @1.6 GHz and 16 GB Ram per node. We also present 
one runtime study on the full MIRA system at the Argonne Leadership Gomputing 
Facility, which has the same architecture as JUQUEEN and 49,152 nodes. The biggest 
occurring number of mesh elements is around 8.5 x 10^^ tetrahedra with 13 million 
elements per process. 

The first two tests are a strong scaling (up to 131k processes) and a runtime 
study of New in 3D, shown in Figure 11. For both tests we use a coarse mesh of 512 
tetrahedra. We time the New algorithm with input level 8 (resp. level 10 for higher 
numbers of processes). We execute the runtime study to examine whether New has 
the proposed level-independent linear runtime in the number of generated tetrahedra, 
which can be read from the results presented in the Table in Figure 11. 

The last test is Adapt with a recursive nonuniform refinement pattern. The 
starting point for all runs is a mesh obtained by uniformly refining a coarse mesh 
of 512 tetrahedra to a given initial level k. This mesh is then refined recursively 
using a single Adapt call, where only the tetrahedra of types 0 and 3 whose level 
does not exceed the fine level A: -|- 5 are refined recursively. The resulting mesh on 
each tetrahedron resembles a fractal pattern similar to the Sierpinski tetrahedron. 
We perform several strong and weak scaling runs on JUQUEEN starting with 128 
processes and scaling up to 131,072. The setting is 16 processes per compute node. We 
finally do another strong scaling run on the full system of the MIRA supercomputer 
at the Argonne Leadership Gomputing Facility with 786,432 processes and again 16 
processes per compute node. Figure 12 shows our runtime results. 

7. Conclusion. We present a new encoding for adaptive nonconforming triangu¬ 
lar and tetrahedral mesh refinement based on Bey’s red-refinement rule. We identify 


^Version vO.l is available at https://bitbucket.org/cburstedde/t8code.git 
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Fig. 12. Strong scaling for Adapt with a fractal refinement pattern. Starting from an initial 
level k on a coarse mesh of 512 tetrahedra we refine recursively to a maximal final level fc + 5. 
The refinement callback is such that only subtetrahedra of types 0 and 3 are refined. Left: Strong 
and weak scaling on JUQUEEN with up to 131,072 processes and strong scaling on MIRA with 
up to 786,432 processes. On both systems we use 16 processes per compute node. The level 12 
mesh consists of 858,588,635,136 tetrahedra. Right: An initial level 0 and final level 3 refinement 
according to the fractal pattern. The subtetrahedra of levels 1 and 2 are transparent. 


six different types of tetrahedra (and two types of triangles) and prescribe an ordering 
of the children for each of these types that differs from Bey’s original order. By intro¬ 
ducing an embedding of the mesh elements into a Cartesian coordinate structure, we 
define a tetrahedral Morton index that can be computed using bitwise interleaving 
similar to the Morton index for cubes. This tetrahedral Morton index shares some 
properties with the well-known cubical one and allows for a memory-efficient random 
access storage of the mesh elements. 

Exploiting the Cartesian coordinate structure, we develop several constant-time 
algorithms on simplices. These include computing the parent, the children, and the 
face-neighbors of a given mesh element, as well as computing the next and previous 
elements according to the SFC. 

In view of providing a complete suite of parallel dynamic AMR capabilities, the 
constructions and algorithms described in this paper are just the beginning. A repar¬ 
titioning algorithm following our SFC, for example, is easy to imagine, but challenging 
to implement if the tree connectivity is to be partitioned dynamically, and if global 
shared metadata shall be reduced from being proportional to the number of ranks to 
the number of compute nodes. The present paper provides atomic building blocks 
that can be used in high-level algorithms for 2:1 balancing [27] and the computation 
of ghost elements and generalized topology iteration [28] ; regardless, these algorithms 
still have to be written and are likely to raise questions that we will have to address 
in future work. Notwithstanding the above, we believe that the choices presented in 
this paper are sustainable for maintaining extreme scalability in the long term. 
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